TAPE: Temporal Attention-Based Probabilistic Human Pose and Shape Estimation

Abstract

Reconstructing 3D human pose and shape from monocular videos is a well-studied but challenging problem. Common challenges include occlusions, the inherent ambiguities in the 2D-to-3D mapping, and the computational complexity of video processing. Existing methods ignore the ambiguities of the reconstruction and provide a single deterministic estimate for the 3D pose. In order to address these issues, we present a Temporal Attention based Probabilistic human pose and shape Estimation method (TAPE) that operates on an RGB video. More specifically, we propose to use a neural network to encode video frames to temporal features using an attention-based neural network. Given these features, we output a per-frame but temporally-informed probability distribution for the human pose using Normalizing Flows. We show that TAPE outperforms state-of-the-art methods on standard benchmarks and serves as an effective video-based prior for optimization-based human pose and shape estimation. Code is available at: https://github.com/nikosvasilik/TAPE
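
The two stages named in the abstract (attention across frames, then a per-frame distribution from a normalizing flow) can be illustrated with a toy sketch. This is not the authors' implementation, which is in the linked repository. Every function name, dimension, and weight below is hypothetical; the attention step has no learned projections, and the flow is a single conditional affine layer, which is the simplest possible normalizing flow. Only the Python standard library is used.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frames):
    """Scaled dot-product self-attention across time (no learned projections).

    frames: list of T per-frame feature vectors, each of length d.
    Returns T temporally-informed vectors of length d.
    """
    d = len(frames[0])
    mixed = []
    for query in frames:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in frames]
        weights = softmax(scores)
        mixed.append([sum(w * f[j] for w, f in zip(weights, frames))
                      for j in range(d)])
    return mixed


def linear(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def flow_sample(context, w_shift, w_log_scale, rng):
    """Draw x = z * exp(s(c)) + t(c) with z ~ N(0, I); s and t depend on context c."""
    shift = linear(w_shift, context)
    log_scale = linear(w_log_scale, context)
    z = [rng.gauss(0.0, 1.0) for _ in shift]
    return [zi * math.exp(si) + ti for zi, si, ti in zip(z, log_scale, shift)]


def flow_log_prob(x, context, w_shift, w_log_scale):
    """Exact log-density by change of variables: log N(z; 0, I) - sum(log_scale)."""
    shift = linear(w_shift, context)
    log_scale = linear(w_log_scale, context)
    z = [(xi - ti) * math.exp(-si) for xi, si, ti in zip(x, log_scale, shift)]
    log_base = sum(-0.5 * zi * zi - 0.5 * math.log(2.0 * math.pi) for zi in z)
    return log_base - sum(log_scale)


def demo(seed=0, num_frames=4, feat_dim=3, pose_dim=2):
    """Run the toy pipeline on random features; all sizes are illustrative."""
    rng = random.Random(seed)
    frames = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)]
              for _ in range(num_frames)]
    w_shift = [[rng.gauss(0.0, 0.1) for _ in range(feat_dim)]
               for _ in range(pose_dim)]
    w_log_scale = [[rng.gauss(0.0, 0.1) for _ in range(feat_dim)]
                   for _ in range(pose_dim)]
    contexts = temporal_attention(frames)  # one context vector per frame
    samples = [flow_sample(c, w_shift, w_log_scale, rng) for c in contexts]
    log_probs = [flow_log_prob(x, c, w_shift, w_log_scale)
                 for x, c in zip(samples, contexts)]
    return samples, log_probs
```

Calling `demo()` returns one sampled toy "pose" vector and one log-density per frame. Because each frame's context mixes information from all frames, the per-frame distributions are temporally informed while remaining separate distributions, which is the property the abstract emphasizes.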

Similar resources

Probabilistic Mapping of Human Visual Attention from Head Pose Estimation

Effective interaction between a human and a robot requires the bidirectional perception and interpretation of actions and behavior. While actions can be identified as a directly observable activity, this might not be sufficient to deduce actions in a scene. For example, orienting our face toward a book might suggest the action toward “reading.” For a human observer, this deduction requires the ...

Probabilistic Temporal Head Pose Estimation Using a Hierarchical Graphical Model

We present a hierarchical graphical model to probabilistically estimate head pose angles from real-world videos, that leverages the temporal pose information over video frames. The proposed model employs a number of complementary facial features, and performs feature level, probabilistic classifier level and temporal level fusion. Extensive experiments are performed to analyze the pose estimati...

Towards Accurate Markerless Human Shape and Pose Estimation over Time

Existing markerless motion capture methods often assume known backgrounds, static cameras, and sequence specific motion priors, limiting their application scenarios. Here we present a fully automatic method that, given multi-view videos, estimates 3D human pose and body shape. We take the recently proposed SMPLify method [12] as the base method and extend it in several ways. First we fit a 3D h...

Towards Accurate Markerless Human Shape and Pose Estimation over Time

We address the problem of accurately estimating human shape, pose, and motion from images and video without markers or special cameras. Existing methods often assume known backgrounds, static cameras, and sequence specific motion priors. Here we propose a method that is fully automatic and, given multi-view video, estimates 3D human motion and body shape. Our work is built upon the recent SMPLi...

Pose-conditioned Spatio-Temporal Attention for Human Action Recognition

We address human action recognition from multi-modal video data involving articulated pose and RGB frames and propose a two-stream approach. The pose stream is processed with a convolutional model taking as input a 3D tensor holding data from a sub-sequence. A specific joint ordering, which respects the topology of the human body, ensures that different convolutional layers correspond to meanin...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2023

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-031-31438-4_28